Stable Isotopomers of myo-Inositol Uncover a Complex MINPP1-Dependent Inositol Phosphate Network

The water-soluble inositol phosphates (InsPs) represent a functionally diverse group of small-molecule messengers involved in a myriad of cellular processes. Despite their centrality, our understanding of human InsP metabolism is incomplete because the available analytical toolset to characterize and quantify InsPs in complex samples is limited. Here, we have synthesized and applied symmetrically and unsymmetrically 13C-labeled myo-inositol and inositol phosphates. These probes were utilized in combination with nuclear magnetic resonance spectroscopy (NMR) and capillary electrophoresis mass spectrometry (CE-MS) to investigate InsP metabolism in human cells. The labeling strategy provided detailed structural information via NMR—down to individual enantiomers—which overcomes a crucial blind spot in the analysis of InsPs. We uncovered a novel branch of InsP dephosphorylation in human cells which is dependent on MINPP1, a phytase-like enzyme contributing to cellular homeostasis. Detailed characterization of MINPP1 activity in vitro and in cells showcased the unique reactivity of this phosphatase. Our results demonstrate that metabolic labeling with stable isotopomers in conjunction with NMR spectroscopy and CE-MS constitutes a powerful tool to annotate InsP networks in a variety of biological contexts.

This Supplementary Information (SI) contains all information on the numerical evaluation of the reaction rates for the InsP 5 [2OH]-and InsP 6 -dephosphorylation from the experimental data. For additional information on the experimental part and figures S1 -S12, please consult the other SI file. Here, we explain the theoretical background of the applied model and the procedure and assumptions that let to the final results presented in main part Fig. 6a and 6b. In this SI, we introduce a six-digit binary representation of the structure names as shown in S18. The numbers in the binary code represent the groups attached to the Cyclohexane scaffold, where the number "0" encodes the hydroxyle group -OH and the number "1" encodes the phosphoryl group -OPO 2− 3 . The position of the number in the binary code (read from left to right) corresponds to the position of the corresponding group in the Cyclohexane scaffold (see S18). For example, the binary representation of Ins(1,2,5,6)P 4 reads 110011 and the binary representation of Ins(4,6)P 2 is 000101.  The network contains all possible intermediates and products including their connection pattern. Each line in S19 represents a rate that describes the reaction from the higher phosphorylated InsPx to the lower phosphorylated InsPx. Since MINPP1 is a phosphatase, the respective reverse reactions are neglected and the network is to be read from top to bottom. The full network contains a total of 75 rates and 31 species, 10 of which have been identified in the NMR-experiments (blue boxes). We assume that the dephosphorylation network of InsP 5 [2OH] is dominated by these 10 observed InsPx (see S8) and the corresponding 13 rates (blue lines).

Full InsP 6 dephosphorylation network
S20 depicts the full reaction network of the MINPP1-mediated dephosphorylation of InsP 6 in the binary representation. The network contains all possible intermediates and products including their connection pattern. Each line represents a rate that describes the reaction from the higher phosphorylated InsPx to the lower phosphorylated InsPx. Since MINPP1 is a phosphatase, the respective reverse reactions are neglected and the network is to be read from top to bottom. The full network contains a total of 186 rates and 63 species, 14 of which have been identified in the NMR-experiments (blue highlighted boxes) with symmetrically and asymmetrically 13 C-labeled InsP 6 . We assume that the S37 S 20: Full reaction network with all theoretically possible intermediates and reaction rates of the MINPP1-mediated dephosphorylation of InsP 6 . The structures highlighted in blue have been identified in the NMR-experiments with the blue lines depicting the corresponding connection pattern. The InsP 5 dephosphorylation network (S19) is highlighted in green.
InsP 6 dephosphorylation pathway is dominated by the observed 14 InsPx (see S9) and the corresponding 21 rates (blue lines). Additionally, S20 compares the InsP 6 dephosphorylation network (highlighted in blue) to the InsP 5 [2OH] dephosphorylation network (highlighted in green). We can clearly see that the two networks do not overlap and therefore do not share a single structure or rate.

Master equation formalism
All processes in the InsP 5 [2OH] dephosphorylation network (SI Fig 19) as well as in the InsP 6 dephosphorylation network (S20) are irreversible chemical reactions of the type where species A j reacts to species A i with the rate constant k ij . N is the total number of species within the network. Please note that we start counting from zero to present the theory in line with the implementation of our analysis in Python3. The rate constants k ij are the matrix elements of the rate matrix K ∈ R N ×N . To ensure mass conservation the diagonal elements of K are defined as the negative of the sum of all other elements in the same column (constraints to master equation) such that the sum over each column evaluates to zero. In other words, K is columnnormalized to zero. The vector ϕ(t) ∈ R N collects the density (or concentration) ϕ A i (t) at time t of all species. The master equation that corresponds to scheme 2.1 readṡ whereφ(t) is the first derivative of ϕ(t) with respect to time. Eq. 2.3 yields N coupled linear homogeneous first order differential equations which describe the kinetics of the entire network. Given the time series ϕ(t) (e.g. from experimental data), the master equation can be used to numerically determine the corresponding rates k ij and incorporate additional constraints. [1,2]

Propagator formalism
In the previous subsection, we describe the kinetics with the corresponding master equation, that is via the change of the density (or concentration) with respect to time. Next, we want to introduce the propagator formalism with which the time evolution of the density (or concentration) for N species can be described directly without the use of time derivatives. The solution of eq. 2.3 is given as where ϕ 0 = ϕ(τ = 0) denotes the initial condition at time τ = 0. Eq. 2.4 contains the operator which is called propagator and acts on the initial density ϕ(0) = ϕ 0 ∈ R N to yield the density ϕ τ after time τ . A given propagator P(τ ) can only propagate the density in increments of the lag time τ to yield the time series ϕ 0 , ϕ τ , ϕ 2τ , . . . , ϕ nτ with n ∈ N according to By recursively inserting each equation into the other we get We want to emphasize that, similar to the master equation formalism, conservation of mass is automatically incorporated into the propagator formalism via the rate matrix K (see eq. 2.2) such that P is column-normalized to one. In summary, given a propagator P(τ ) we can compute the density ϕ nτ at time nτ either by computing all intermediate steps as described in eqs. 2.6 or evaluate ϕ nτ directly via eq. 2.7. In other words, given all rates k ij we can use the propagator formalism to predict the progress curves of all N species in the network. [2,3] 2.5 Minimization method to numerically determine rates Let's assume we experimentally obtained the concentration of all N species within a network at different discrete times. In other words, we know the density (concentration) vectors ϕ exp 0 , ϕ exp τ , ϕ exp 2τ , . . . , ϕ exp nτ ∈ R N at times 0, τ, 2τ, . . . , nτ . From this time series, we can numerically determine the time-derivatives as a finite differencė Additionally, we can define a set of n master equationṡ where the elements of K are unknown. With eq. 2.9 we can predictφ mτ for a specific choice of K. By selecting one value of m we obtain one master equation for this specific S40 m as where we abbreviate ϕ exp mτ = x exp = (x exp 0 , x exp 1 , . . . , x exp N −1 ) T andφ mτ = y = (y 0 , y 1 , . . . , y N −1 ) T . Letφ exp mτ = y exp = (y exp 0 , y exp 1 , . . . , y exp N −1 ) T be the density vector at time mτ which was calculated numerically from the experimental data via eq. 2.8. We use the mean squared error ∆(mτ ) to measure the error of the prediction of y described in eq. 2.10 and the experimentally obtained y exp (eq. 2.8).
K is column-normalized to zero such that we can substitute the diagonal matrix elements k ii by eq. 2.2. The error in eq. 2.11 then only depends on the off-diagonal elements of K. We can now use a least-square method to minimize ∆(mτ ) with respect to these off-diagonal elements to get a rate matrix K that produces y as close as possible to the experimentally observed y exp .
With eqs. 2.10 and 2.11 we only made use of the experimental data ϕ exp (m+1)τ and ϕ exp mτ at two distinct times m and m + 1 to determine K. Next, we extend our approach to all values of m such that we can include the entire experimental time series as described in eq. 2.9. In this context, we define the overall error ∆ as a sum over all individual errors ∆(mτ ) (eq. 2.11) ∆ = ∆(τ ) + ∆(2τ ) + · · · + ∆(nτ ) .
(2.12) and minimize eq. 2.12 with respect to the off-diagonal elements k ij in order to get a good estimate for the rate matrix K. Please note, that the described least-square method is particularly effective if the unknown K is sparse, meaning if it contains a lot of zeros. Furthermore, the method allows for additional constraints (additional to column-normalization, e.g. fixing certain reaction rates to a predefined value) and upper and lower limits for the value of the unknown parameters (e.g. for reaction rates we have k ij ∈ [0, 1]). As indicated in S20, the InsP 5 [2OH] dephosphorylation is dominated by 10 different InsPx forming a network that includes 13 different rates. Consequently, the corresponding rate matrix K is 10 × 10-dimensional and sparse, which makes the minimization process described above a very well suited tool to determine the reaction rates of the kinetic network. The same argument holds for the InsP 6 dephosphorylation network which consists of 12 species and 17 rates, yielding a sparse 12 × 12-dimensional rate matrix. Finally, we want to emphasize that a good initial guess for all elements of K is crucial for the convergence behaviour of a minimization routine as described above.

Consecutive first-order kinetics
We consider the simplest example of consecutive first order kinetics which is given as where one irreversible reaction from species A 1 to A 2 with the reaction rate k 21 is followed by a second irreversible reaction from A 2 to A 3 with the reaction rate K 32 . The naming convention of the reaction scheme follows eq. 2.1. The corresponding master equation is . The diagonal elements of K are given as k 11 = −k 21 and k 22 = −k 32 (see eq. 2.2). The master equation yields a system of three coupled linear differential equationṡ with the analytic solution for the case k 21 ̸ = k 32 and the initial condition ϕ A 1 (0) = ϕ 0 A 1 . [4] S21 shows an example for the progress curves defined in eqs. 2.17-2.19 for randomly selected rates.
S 21: The progress curves for two consecutive first order reactions as defined in eq. 2.13 for a set of example rates.

Analysis protocol
To extract the reaction rates from the scaled experimental data shown in S22 and S22 we use the least-square minimization routine described in SI section 2.5 following the protocol presented below for both replicas of the InsP 5 [2OH] dephosphorylation respectively as well as for the InsP 6 dephosphorylation.
We can clearly see that S exp (t) ̸ = 175 µM for all t, meaning that we "loose" mass during the course of the experiment and conservation of mass is not fulfilled by the original experimental data. Since conservation of mass is crucial for the kinetic model we use to extract rates from the experimental data, we correct for the loss of mass by scaling the experimental data according to The scaled progress curves ϕ scaled i (t) are shown as dashed lines in S22 and in main part Fig. 5a. The results of of both replicas exhibit similar behaviour but the progress curves of replica 2 indicate slightly faster kinetics. In the main part we chose replica 1 as representative for both replicas. To extract kinetic of the InsP 5 dephosphorylation, we perform the numerical analysis on both replicas separately and compare the resulting rates in SI section 4.1. Please note that we solely use the scaled progress curves for the numerical analysis.
To prepare the scaled experimental data for the numerical analysis, we fitted the progress curves of each species with an analytic fit function. The fit functions provide access to more and time-equidistant data points and analytical derivatives for each progress curve (no numerical derivatives necessary). S23 compares the fit function to the scaled experimental data for both replicas and SI table 1 summarizes the fit functions and the corresponding fit parameters. Please note, that we used the kinetic function defined in eq. 2.18 to fit the progress curve of Ins(1,4,5)P 3 (dark green circles) meaning that the fit parameters k 1 and k 2 can already be interpreted as reaction rates. S46 SI Ins(1,4,5,6)P 4 Ins(1,4,5)P 3 Ins(1,4,6)P 3 S9 and S20 depict our assumption of the complete MINPP1-mediated InsP 6 dephosphorylation pathway and main part Fig. 5d shows the corresponding simplified version. The pathway contains the enantiomers Ins(1,2,4)P 3 and Ins(1,2,6)P 3 and the enantiomers Ins(1,2)P 2 and Ins(2,3)P 2 which can only be distinguished in asymmertrically 13 C-labeled NMR experiments. However, we base our numerical analysis on the progress curves shown in S24,top which resulted from the MINPP1 reaction with symmetrically labeled [ 13 C 6 ]InsP 6 . Consequently, both pairs of enantiomers are represented by one progress curve each, which we labeled with one representative for each pair of enantiomers. This reduces the network in S20 from 14 to 12 different species. The solid lines in S24,top, represent the experimentally measured concentration time series ϕ exp i (t) with i = 0, . . . , N −1 and N = 12. The orange line in S24, bottom represents the sum S exp (t) over the concentrations of all 12 species at each point in time. Since the original experimental data does not obey conservation of mass over the entire time axis, we scale the data according to eq. 3.2. The scaled progress curves are shown as dashed lines and are equivalent to the solid lines in main part Fig. 5c. We want to emphasize that we solely use the scaled progress curves for all further analysis. To prepare the scaled experimental data for the numerical analysis, we fitted the progress curves of each species with an analytic fit function. S25 compares the fit function to the scaled experimental data and SI table 2 summarizes the fit functions and the corresponding fit parameters. We used eq. 2.17 as fit function to fit the InsP 6 progress curve which means that we can interpret the fit parameter k as the reaction rate that quantifies the depletion of InsP 6 over time. Moreover, we used eq. 2.18 to fit the InsP 5 [3OH] progress curve and thus we can interpret the fit parameters k 1 and k 2 as reaction rates that dictate the growth and the depletion of InsP 5 [3OH] concentration in time. f (t) = a exp(−kt) a = 175.00 k = 0.000932 f (t) = at b exp(−kt) a = 6.2 · 10 −18 b = 6.697 k = 0.00460

Network:
Based on the NMR-data (see main part Fig. 3 and S22), we assume that the InsP 5 [2OH] dephosphorylation network is dominated by 10 different InsPx that form the network depicted in S26, a. All possible reactions from a higher phosphorylated InsPx to a lower phosphorylated InsPx are indicated with a line and are associated with a reaction rate k ij ̸ = 0.

Density (concentration) vector and corresponding time derivative:
We use the fit functions (SI table 1  All other matrix elements are equal to zero (white squares in S26, b).

Constraints and bounds:
Here, we report the applied constraints that yielded the best results for the reaction rates reported in main part Fig. 6a and 6b. In total, we constrained 4 rates (k 10 , k 41 , k 73 and k 42 ), which leaves us with 9 reaction rates that have to be optimized during the minimization routine.
k 10 and k 41 : Since the progress curve of Ins(1,3,4,6)P 4 evolves at very low concentrations (less than 9 µM for the entire time series), we decided to exclude this species from the analysis and S54 set the corresponding rates to zero, k 10 , k 41 = 0, for both replicas.
k 73 : As mentioned in section 3.2 , we use eq. 2.18 as fit function for the progress curve of Ins(1,3,4)P 3 (SI table 1). Since this function emerges from a kinetic model, we can interpret the corresponding fit parameters k 1 and k 2 as kinetic rates with k 1 describing the increase and k 2 the decrease of concentration. The increase in Ins(1,3,4)P 3 concentration is determined by k 32 and the decrease is determined by k 53 + k 63 + k 73 . We use the fit parameter k 2 (2.62 · 10 −4 min −1 for replica 1 and 5.53 · 10 −4 min −1 for replica 2) to constrain the rate k 73 as k 73 = k 2 − k 53 − k 63 and leave k 32 unconstraint for each replica respectively.
k 42 : During our analysis, we found that the increase in Ins(1,4,6)InsP 3 concentration was generally overestimated by the minimization routine and thus we decided to constrain the rate k 42 towards Ins(1,4,6)InsP 3 by hand. In an iterative procedure, we found that the constraint k 42 = 0.001 yields the most promising results for both replicas.
bounds: Since reaction rates are a real number between zero and one, we bound all rates to the interval k ij ∈ [10 −6 , 1].

Initial guess:
We built the rate matrix K by formulating an initial guess for each rate, used eq. 2.7 to predict the corresponding progress curves and compared these prediction to the scaled experimental data (S22, dashed lines). In an iterative procedure, we corrected the rates by hand until the set of rates produced progress curves that roughly matched the scaled experimental progress curves. The set of rates is summarized in SI table 3 and serves as initial guess for our minimzation routine.

Technical details:
We used Python3 and scipy.optimize.minimize [5] to implement the minimization routine described in SI section 2.5, where we passed eq. 2.12 as objective function to be minimized, the initial guess and bounds as described above and left all other parameters at their default settings. Since we constraints 4 of the 13 rates, the implemented minimization routine optimizes the remaining 9 rates such that the resulting rate matrix K yields progress curves that are in excellent agreement with the scaled experiment data.

S55
SI Table 3: Initial guess for all rates of the InsP 5 [2OH] dephosphorylation network in min −1 for both replicas.

Rate matrix:
To build the rate matrix K, we number all InsPx included in the network from zero to eleven in a left-to-right and top-to-bottom fashion and assign the corresponding rates according to eq. 2.1. S27,b represents these rates as blue squares. The diagonal elements (green squares) are defined via eq. 2.2 and given as All other matrix elements are equal to zero (white squares).

Constraints and bounds:
Here, we report the applied constraints that yielded the best results for the reaction rates reported in SI table 5. In total, we constrain 3 reaction rates (k 10 , k 20 and k 52 ) which left us with 14 reaction rates that have to be optimized during the minimization routine.
k 20 and k 52 : As mentioned in section 3.3, we used eq. 2.18 as fit function for the progress curve of InsP 5 [3OH] (SI table 2). Consequently, we can interpret the fit parameter k 1 as reaction rate that describes build-up of concentration and k 2 as reaction rate that describes the decrease of concentration. According to the network in S27a, the increase of InsP 5 [3OH] concentration is solely determined by k 20 and thus we constrain k 20 = k 1 = 7.4 · 10 −4 min −1 . The decrease of InsP 5 [3OH] concentration is determined by k 42 + k 52 and we set the constraint k 52 = k 2 − k 42 = 9.3 · 10 −4 − k 42 .
k 10 : We used eq. 2.17 as fit function for the progress curve of InsP 6 (SI table 2) and thus the fit parameter k represents the reaction rate that dictates the decrease of InsP 6 concentration over time. According to the network in S27,a, this decrease is described by k 10 + k 20 and we constrain k 10 = k − k 20 = 1.9 · 10 −4 min −1 .
bounds: We set the bounds k 108 ∈ [10 −3 , 1], k 119 ∈ [10 −5 , 1] and k 1110 ∈ [10 −4 , 1] to bruteforce increase the influence of these rates on the network and prevent the minimization routine from setting all of them to the lowest possible value 10 −6 . All other rates were bound to the interval k ij ∈ [10 −6 , 1].

Initial guess:
To generate a good initial guess for the unconstrained rates, we started with a reduced network that included InsP 6 , InsP 5 [4OH], InsP 5 [3OH], Ins(1,2,3,6)P 4 , Ins(1,2,5,6)P 4 and Ins(1,2,4,5)P 4 and performed a minimization run. Next, we increased the network by including Ins(1,2,3)P 3 , Ins(1,2,6)P 3 and Ins(1,2,5)P 3 and repeated the minimization, where we used the results from the previous run for k 10 , k 31 and, k 41 and the default initial guess for the remaining rates. Finally, we repeated this step with the full network which yielded a good initial guess for all 13 unconstrained rates as shown in SI table 4.

Technical details:
For all technical details, the reader is referred to section 3.4.

Results InsP 5 [2OH] dephosphorylation
The numerically determined set of rates for both replicas are presented in S29. We can see that replica 2 exhibits slightly faster kinetics than replica although both share an identical experimental setup. Excluding the rates k 95 and k 97 , both sets of rates are in good agreement (S29, b). The fastest process is described by k 20 ≈ 10 −1 min −1 which governs the reaction InsP 5 [2OH] → Ins(1,4,5,6)P 4 . This result is in good agreement with MINPP1's annotation as a phosphatase that predominantly removes the phosphoryl group at the 3-position. [6] The reaction rates of the subsequent dephosphorylation steps are separated by at least one order of magnitude, where we get k ij ≈ 10 −2 min −1 for reactions of the type InsP 4 → InsP 3 , k ij ≈ 10 −4 min −1 for reactions of the type InsP 3 → InsP 2 and k ij ≈ 10 −5 min −1 for reactions of the type InsP 2 → InsP 1 . In S28, we compare the scaled experimental data (dotted lines ) to the progress curves (solid lines) predicted from the numerically determined set of rates (eq. 2.7) for each replica. The predicted progress curves match the experimental data both qualitatively and quantitatively which strongly supports the assumption that the reaction rates in the InsP 5 [2OH] dephosphorylation network are time independent. Furthermore, the results  2.95 · 10 −3 k 41 1.00 · 10 −6 k 42 2.88 · 10 −4 k 52 6.49 · 10 −4 * k 63 5.10 · 10 −4 k 73 3.67 · 10 −4 k 74 1.56 · 10 −3 k 84 2.13 · 10 −5 k 85 5.99 · 10 −4 k 96 2.02 · 10 −4 k 97 2.22 · 10 −4 k 98 3.88 · 10 −3 k 108 1.00 · 10 −3 k 119 1.00 · 10 −5 k 1110 1.49 · 10 −4 S30 shows the comparison between the scaled experimental data (dashed lines) and the progress curves predicted by the numerically determined rates (solid lines). We can clearly see that the computed rates yield a poor representation of the experimental progress curves which strongly indicates that the applied time-constant rates model is insufficient S62 to describe the InsP 6 dephosphorylation. The shapes of the experimental progress curves already indicate a kinetic network with time-dependent rates, e.g. the InsP 6 progress curve does not represent an exponential decay as we would expect from a first-order reaction. Instead, we observe a damped decrease which could emerge from a inhibition process. As mentioned in main part (Fig. 6c) we suggest that InsP 6 itself could act as an inhibitor for the dephosphorylation of its own MINPP1-mediated intermediates. [7] This assumption is further supported by the fact, that the kinetics clearly accelerate as soon as InsP 6 is fully depleted. However, we can roughly approximate the rate at which InsP 6 is depleted as k 10 + k 20 = 9.3 · 10 −4 min −1 . Based on our results and the discussion above, we conclude that our master equation ansatz (SI section 2.5) is not capable to capture the true kinetics for the MINPP1-mediated dephosphorylation of InsP 6 and thus does not provide any more insight into the main pathways that generate the enantiomers Ins(1,2)P 2 and Ins(2,3)P 2 . For the sake of completeness, we report the numerically determined rates in SI table 5 but did not include these results in the main part of our work. We renounce to extend our ansatz in order to include inhibition processes but this kind of analysis is beyond the scope of this paper.