Mass Spectrometry Imaging Reveals Early Metabolic Priming of Cell Lineage in Differentiating Human-Induced Pluripotent Stem Cells

Induced pluripotent stem cells (iPSCs) hold great promise in regenerative medicine; however, few algorithms of quality control at the earliest stages of differentiation have been established. Despite lipids having known functions in cell signaling, their role in pluripotency maintenance and lineage specification is underexplored. We investigated the changes in iPSC lipid profiles during the initial loss of pluripotency over the course of spontaneous differentiation using the co-registration of confocal microscopy and matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging. We identified phosphatidylethanolamine (PE) and phosphatidylinositol (PI) species that are highly informative of the temporal stage of differentiation and can reveal iPS cell lineage bifurcation occurring metabolically. Several PI species emerged from the machine learning analysis of MS data as the early metabolic markers of pluripotency loss, preceding changes in the pluripotency transcription factor Oct4. The manipulation of phospholipids via PI 3-kinase inhibition during differentiation manifested in the spatial reorganization of the iPS cell colony and elevated expression of NCAM-1. In addition, the continuous inhibition of phosphatidylethanolamine N-methyltransferase during differentiation resulted in the enhanced maintenance of pluripotency. Our machine learning analysis highlights the predictive power of lipidomic metrics for evaluating the early lineage specification in the initial stages of spontaneous iPSC differentiation.

Figure S1. Custom-made silicone 8-well wall adhered to an ITO slide used for cell culture
Figure S2. Partial least-squares regression (PLSR) of the day of differentiation against lipid abundances
Figure S3. Surface markers fluorescence intensity on a cell-by-cell basis
Figure S4. Temporal and spatial changes induced by phosphatidylinositol 3-kinase inhibition
Figure S5. Ncam1 and Oct4 spatial expression in iPSC colonies are altered with LY294002
Figure S6. Immunocytochemistry images of the 8-day spontaneous differentiation for control and PI 3-kinase inhibited conditions
Figure S7. Dividing cells manual annotation and validation results
Figure S8. Averaged mass spectra examples from images collected using MALDI-TOF of iPSC colonies
Figure S9. Representative blank spectra collected on Bruker Rapiflex MALDI TOF and Bruker SolariX MALDI FTICR
Table S1. Summary of all experimental conditions tested
Table S2. Lipid annotations for 16 selected features found in all 3 spontaneous differentiation experiments

S2. Cell culture
HiPSC WTC11 cells (Coriell Institute, catalog ID GM25256, sex: male) were grown in 6-well plates. Wells were coated with 1 mL Matrigel (GFR in Knockout D-MEM, 1:100) per well and incubated overnight. Cells were fed 2 mL media per well daily (basal mTeSR Plus media + supplement, 4:1). During passage, 0.5 mL accutase was added to each well and incubated for 3 min. Cells were lifted and collected into a 15 mL tube with excess DPBS (~3× the volume of accutase used). Cells were centrifuged at 1000 rpm for 5 minutes, the supernatant was removed, and the pellet was resuspended in 2 mL media with 2 µL ROCK inhibitor. Cells were seeded at a density of 100-200 K in 2 mL media, with 2 µL ROCK inhibitor added for the first day after passage; afterwards, cells were fed as usual.
S3. Flow cytometry
Eight control wells and 8 DZA-exposed wells underwent 0 to 7 days of spontaneous differentiation in 6-well plates; each well was reproduced 3 times. Cells were lifted with 0.5 mL accutase per well of a 6-well plate and centrifuged at 1000 rpm for 5 minutes, the supernatant was removed, and the pellet was resuspended in 1 mL of 4% paraformaldehyde solution in PBS for cell fixation. After 10 min at room temperature, cells were centrifuged again and resuspended in 1 mL of 0.3% Triton™ X-100 solution in PBS for permeabilization. After 15 min at room temperature, cells were centrifuged and resuspended in 1 mL Odyssey Blocking Buffer for 1 hour at room temperature. Next, cells were centrifuged and resuspended in 1 mL Odyssey Blocking Buffer with 5 µL mouse anti-human Oct4 primary antibody and left at 4 °C overnight. Cells were then centrifuged and resuspended in 1 mL PBS as a wash step. At this point, 0.5 µL of cells from the Day 0 sample was set aside as a negative control. Next, cells were centrifuged and resuspended in 1 mL Odyssey Blocking Buffer with 1 µL anti-mouse Alexa Fluor Plus 488 secondary antibody for 30 minutes in the dark. Finally, after another wash step, cells were centrifuged, resuspended in 1 mL PBS, and transferred to a FACS tube through the strainer cap. Samples were analyzed on a BD FACSMelody™ Cell Sorter (BD Biosciences, San Jose, CA, USA) with an excitation wavelength of 488 nm and detection in the 515-545 nm range. Data acquisition was performed with BD FACSChorus™ Software (BD Biosciences), and data processing was performed with FlowJo™ (BD Biosciences). The Oct4 gate was set so that 99.9% of negative control events fall into the Oct4-negative category. The percentage of cells registered as Oct4-positive according to this gate was recorded for each sample.
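The gating rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the actual gating was done in FlowJo™, the function name `oct4_positive_percent` is hypothetical, and the log-normal intensities are synthetic stand-ins for real events.

```python
import numpy as np

def oct4_positive_percent(negative_control, sample, control_percentile=99.9):
    """Return (gate, percent of sample events above the gate).

    The gate is the intensity below which `control_percentile` percent of
    negative-control events fall, mirroring the 99.9% rule in the text.
    """
    negative_control = np.asarray(negative_control, dtype=float)
    sample = np.asarray(sample, dtype=float)
    gate = np.percentile(negative_control, control_percentile)
    percent_positive = 100.0 * np.mean(sample > gate)
    return gate, percent_positive

# Synthetic example (assumed distributions, not measured data).
rng = np.random.default_rng(0)
control = rng.lognormal(mean=2.0, sigma=0.4, size=50_000)
stained = np.concatenate([
    rng.lognormal(mean=2.0, sigma=0.4, size=5_000),    # Oct4-negative-like events
    rng.lognormal(mean=5.0, sigma=0.4, size=15_000),   # Oct4-positive-like events
])
gate, percent_positive = oct4_positive_percent(control, stained)
print(f"gate = {gate:.1f}, Oct4-positive = {percent_positive:.1f}%")
```

Setting the gate on the control rather than on each sample keeps the false-positive rate fixed at roughly 0.1% across all time points.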
S4. Immunocytochemistry
Eight control wells and 8 wells exposed to 100 µM LY294002 underwent 0 to 7 days of spontaneous differentiation on ITO-covered glass slides with a glued PDMS 8-well wall. Wells were washed with 100 µL of PBS each (wash step) and then fixed with 100 µL of 4% paraformaldehyde solution in PBS for 10 minutes. After 3 wash steps, cells were permeabilized for 15 minutes with 100 µL of 0.3% Triton™ X-100 solution in PBS. After a wash step, cells were blocked with 100 µL of Odyssey Blocking Buffer for 1 hour at room temperature. Next, we diluted mouse anti-human Oct4 primary antibody at a 1:200 ratio in Odyssey Blocking Buffer and added anti-human NL557-conjugated Otx2 antibody at a 1:100 ratio as well as anti-human Alexa Fluor 647-conjugated Pax6 antibody at a 1:50 ratio. Cells were treated with 100 µL of the antibody mixture for 1 hour in the dark. After 3 wash steps, cells were treated with 100 µL of Odyssey Blocking Buffer containing Hoechst (1:1000) and anti-mouse Alexa Fluor Plus 488 secondary antibody (1:1000) for 30 minutes in the dark. Finally, after 3 wash steps, 100 µL of PBS was added to each well and cells were imaged with a Nikon UltraVIEW VoX W1 Spinning Disk Confocal with an sCMOS camera at 10x magnification (0.65 µm/px), 100 ms exposure, and 100% laser power for all wavelengths.

S5. Nuclei Segmentation
The confocal image stained with Hoechst was also used to segment nuclei to extract the abundance data at each m/z value of interest on a cell-by-cell basis along with the corresponding fluorescence intensities from the confocal image. First, a local threshold was applied to the image at window sizes ranging from a fourth of the image to twice the size of the largest cell (parameter provided by the user); this was done to obtain the most comprehensive binary mask of the nuclei. Next, the segmentation algorithm uses a multiscale Laplacian of Gaussian (LoG) blob detection algorithm [1] implemented using OpenCV [2] for Python to find nuclei seeding points. The LoG of the image is computed for each radius in a user-provided range of nuclear radii. The LoG is found by applying a Gaussian filter with a standard deviation of σ = r/√2, where r is the radius in pixels, and subsequently finding the second spatial derivative of the image. This results in a range of images containing several local minima at the center of each blob; the intensity of the minimum at each blob corresponds with how closely the actual radius of the blob matches the parameter of the Gaussian filter. Following normalization of each image by multiplying it by σ², the minimum at each pixel across the stack of all calculated LoGs is taken. The center of each nucleus, or seed, can be found at the local minima of this resultant image, and applying the watershed transformation at these points on the binary mask yields the nuclear labels.
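The seed-finding step above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: it swaps in `scipy.ndimage` for the OpenCV-based code, the function name `log_seed_points` and the synthetic two-blob image are assumptions, and the final watershed step is left out.

```python
import numpy as np
from scipy import ndimage as ndi

def log_seed_points(image, radii, mask=None):
    """Find blob centers as local minima of the scale-normalized LoG stack."""
    image = np.asarray(image, dtype=float)
    stack = []
    for r in radii:
        sigma = r / np.sqrt(2.0)  # sigma = r / sqrt(2), as in the text
        # gaussian_laplace applies Gaussian smoothing and the Laplacian together;
        # multiplying by sigma**2 makes responses comparable across scales.
        stack.append(sigma ** 2 * ndi.gaussian_laplace(image, sigma=sigma))
    response = np.min(stack, axis=0)  # per-pixel minimum over all scales

    # A seed is a pixel that equals the minimum of its neighborhood and has a
    # negative response (bright blobs give negative LoG values at their center).
    size = 2 * int(min(radii)) + 1
    is_seed = (response == ndi.minimum_filter(response, size=size)) & (response < 0)
    if mask is not None:
        is_seed &= np.asarray(mask, dtype=bool)
    return np.argwhere(is_seed), response

# Synthetic test image: two Gaussian "nuclei" on a dark background.
yy, xx = np.mgrid[0:64, 0:64]
image = (np.exp(-((yy - 20) ** 2 + (xx - 20) ** 2) / (2 * 3.0 ** 2))
         + np.exp(-((yy - 44) ** 2 + (xx - 40) ** 2) / (2 * 3.0 ** 2)))
seeds, response = log_seed_points(image, radii=[3, 4, 5, 6, 7], mask=image > 0.5)
print(seeds)
```

The returned seed coordinates would then serve as markers for a watershed transform restricted to the binary nuclear mask.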

Cell-to-cell connection distance is calculated by first finding the closest-neighbor distance for each cell. Next, we clean these distances of outliers by removing all values that are higher than the mean plus three standard deviations. Finally, the cell-to-cell connection distance is assigned the maximum of the cleaned distances array. This approach guarantees that every non-outlier cell will have at least one cell within the cell-to-cell connection radius. The cells within that radius are called neighbors. Neighbor-relative abundance was measured by first finding all neighbors of a cell of interest. Next, the average abundance is calculated among the neighboring cells, and the cell's own value is divided by the average neighbor value.
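The two metrics above can be sketched with NumPy alone. The function names and the toy 5 × 5 grid with one stray cell are illustrative assumptions; a brute-force pairwise distance matrix is used for clarity and would need a spatial index for colonies with many thousands of cells.

```python
import numpy as np

def connection_distance(centers):
    """Return (connection distance, pairwise distance matrix)."""
    centers = np.asarray(centers, dtype=float)
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # a cell is not its own neighbor
    nearest = dist.min(axis=1)              # closest-neighbor distance per cell
    cutoff = nearest.mean() + 3 * nearest.std()
    cleaned = nearest[nearest <= cutoff]    # drop outliers above mean + 3 SD
    return cleaned.max(), dist

def neighbor_relative(values, dist, radius):
    """Divide each cell's value by the mean value of its neighbors."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)      # NaN for cells with no neighbors
    for i in range(len(values)):
        neighbors = dist[i] <= radius
        if neighbors.any():
            out[i] = values[i] / values[neighbors].mean()
    return out

# Toy data: a regular grid with 10 px spacing plus one isolated outlier cell.
centers = [(10 * i, 10 * j) for i in range(5) for j in range(5)] + [(500, 500)]
values = np.ones(len(centers))
values[12] = 2.0                            # center cell of the grid is brighter
radius, dist = connection_distance(centers)
relative = neighbor_relative(values, dist, radius)
print(radius, relative[12], relative[7])
```

In this toy case the isolated cell is removed as an outlier, so the connection distance equals the grid spacing and the isolated cell ends up with no neighbors.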
To identify dividing cells, we segmented the nuclei images and calculated the following metrics for each cell: neighbor-relative Hoechst intensity, the area of the nucleus, and the distance to the nearest neighbor. A K-Means (K = 2) clustering algorithm from the scikit-learn library was then trained to classify the cells as either 'dividing' or 'not dividing' using those metrics. Of the two resulting class centers, we designated the center with higher neighbor-relative Hoechst intensity, smaller area, and larger nearest-neighbor distance to represent the dividing cells class. To estimate the classification accuracy, we manually annotated three 1000 × 1000-pixel patches of a Hoechst-stained colony image (Figure S7) and used the algorithm to predict cell labels, yielding an accuracy of 98.8%. A sensitivity of 72.5% and a positive predictive value of 96.3% showed that this method is much more prone to false negatives than to false positives, which is preferable when the data have a disproportionately high number of negative datapoints. Statistical significance was determined by the two-tailed Mann-Whitney U test with a significance threshold of p-value < 0.05.
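As a reminder of how the three validation metrics relate, the sketch below computes them from a confusion matrix. The individual counts are not reported in this document; the values used here are hypothetical, chosen only because they are consistent with the 109 annotated dividing cells and 2846 segmented cells given in the Figure S7 caption and reproduce the reported percentages.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall), and positive predictive value (precision)."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of true dividing cells found
        "ppv": tp / (tp + fp),           # fraction of predicted dividing cells that are real
    }

# Hypothetical counts (assumption): 109 annotated dividing cells, 2846 cells total.
tp, fn = 79, 30            # tp + fn = 109 annotated dividing cells
fp = 3
tn = 2846 - tp - fn - fp   # remaining cells are true negatives
metrics = classification_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With so few positives, accuracy is dominated by true negatives, which is why sensitivity and positive predictive value are reported alongside it.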
To detect the edge of the colony, we multiplied the cell-to-cell connection distance by a user-provided value (default is 3) to expand each cell's neighborhood. Cells on the edge of a colony can be distinguished by having at least one side with no neighbors in their network. To determine whether a cell is on the edge, its personal neighborhood is represented as a series of vectors connecting the center node cell to each of its neighbors. Next, we sort these vectors by their angles, and if any difference between two consecutive angles is greater than π/2 radians, the cell is labelled as an "edge" cell. The edge distance metric is then derived by computing the distances between a given cell and all edge cells and taking the minimum value.
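The angular-gap test for edge cells can be sketched as below. The function names are hypothetical, the gap threshold is exposed as a parameter, and its default of π/2 is an assumption of this sketch; the neighbor coordinates are toy values.

```python
import numpy as np

def is_edge_cell(center, neighbors, max_gap=np.pi / 2):
    """True if some angular sector wider than `max_gap` contains no neighbor."""
    center = np.asarray(center, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float).reshape(-1, 2)
    if len(neighbors) == 0:
        return True                              # an isolated cell counts as edge
    vectors = neighbors - center
    angles = np.sort(np.arctan2(vectors[:, 1], vectors[:, 0]))
    # Gaps between consecutive angles, including the wrap-around gap
    # from the last angle back to the first.
    gaps = np.diff(np.append(angles, angles[0] + 2 * np.pi))
    return bool(gaps.max() > max_gap)

def edge_distance(cell, edge_cells):
    """Minimum distance from a cell to any edge cell."""
    edge_cells = np.asarray(edge_cells, dtype=float).reshape(-1, 2)
    return float(np.sqrt(((edge_cells - np.asarray(cell, float)) ** 2).sum(axis=1)).min())

# Interior cell: neighbors all around. Edge cell: neighbors on one side only.
ring = [(np.cos(a), np.sin(a)) for a in np.arange(8) * np.pi / 4]
half = [(np.cos(a), np.sin(a)) for a in np.arange(5) * np.pi / 4]
print(is_edge_cell((0, 0), ring), is_edge_cell((0, 0), half))
print(edge_distance((0, 0), [(3, 4), (6, 8)]))
```

Sorting by angle reduces the question "is one side empty?" to a single pass over consecutive gaps, so the check is cheap even with the expanded neighborhood.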
S7. Machine learning
A Shapiro-Wilk test was performed to confirm that phospholipid abundance data are not normally distributed within one field of view. To train a classification tree, we randomly selected 8000 cells from each day of differentiation and merged them into a training dataset of 64000 data points, with the differentiation day number as the label and m/z values as features. We repeated this with a replicate experiment to obtain a validation dataset of 64000 data points. The classification tree for predicting the day of differentiation was built using the MATLAB built-in function fitctree, with the number of tree splits limited by setting the MaxNumSplits parameter to 20. We used the variable appending method for variable selection, iteratively adding the variable that most increased the prediction accuracy on the validation dataset. After the first 5 variables were introduced this way, the accuracy stopped increasing, indicating that these variables are the most predictive of the differentiation time point. Prior to partial least squares discriminant analysis (PLS-DA), to determine which cells are TRA-1-81, SSEA-1, or NCAM-1 positive, we first applied k-means clustering to the corresponding extracted fluorescence intensities using the MATLAB built-in function kmeans, with the number of clusters equal to 2, for each live stain individually. For PLS-DA in control experiments, we did not observe a significant number of NCAM-1 positive cells and thus focused on the TRA-1-81 and SSEA-1 stains. Each cell was assigned a positive or a negative label for both stains (TRA-1-81+ or −, SSEA-1+ or −), as illustrated in Figure S3a. Next, we excluded double negative and double positive cells from the analysis as potential artifacts of staining and assigned all the remaining cells a pluripotent versus differentiated label, where pluripotent cells are positive in TRA-1-81 and negative in SSEA-1, and vice versa for the differentiated cells.
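The variable appending procedure described above is a greedy forward selection, which can be sketched generically. This is not the MATLAB workflow itself: the `score` callable stands in for "validation accuracy of a tree trained on these variables", and the feature names and gain values in the toy example are hypothetical.

```python
def forward_select(candidates, score, tol=0.0):
    """Greedy forward selection.

    At each step, add the candidate that gives the highest score when appended
    to the current selection; stop when no candidate improves the score by
    more than `tol`.
    """
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        scored = [(score(selected + [c]), c) for c in remaining]
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best + tol:
            break                      # accuracy has stopped increasing
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected, best

# Toy stand-in for validation accuracy (assumption): each feature contributes
# a fixed gain, and "mz_d" is uninformative.
GAINS = {"mz_a": 0.3, "mz_b": 0.2, "mz_c": 0.1, "mz_d": 0.0}

def toy_score(features):
    return sum(GAINS[f] for f in features)

selected, best = forward_select(["mz_d", "mz_c", "mz_b", "mz_a"], toy_score)
print(selected, round(best, 3))
```

In practice, `score` would retrain the classifier on the training set with the given variables and return its accuracy on the validation set, so each step costs one model fit per remaining candidate.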
Similarly, for the PI 3-kinase inhibited experiment (100 µM condition), we did not observe a significant number of TRA-1-81 positive cells and thus focused on the NCAM-1 and SSEA-1 stains. Each cell was assigned a positive or a negative label for both stains (NCAM-1+ or −, SSEA-1+ or −), as illustrated in Figure S3b; double positives and double negatives were excluded from the subsequent analysis. After all cells received their lineage label, we used SIMCA® software (Sartorius AG, Göttingen, Germany) for PLS-DA to predict the labels using m/z values as features. To select the best predictors, we used variable trimming: iteratively removing every variable that reduced prediction accuracy on a validation dataset.

Class boundaries determined through k-means clustering are 15.51 for SSEA-1 and 14.94 for NCAM-1. The same class boundaries were applied to all days of the 100 µM condition. C. Histogram of NCAM-1 fluorescence intensity for days 5, 6, and 7 combined. The arrow shows the class boundary determined by k-means from day 7 of the 100 µM condition.

Figure S4. Temporal and spatial changes induced by phosphatidylinositol 3-kinase inhibition. A. Top row: confocal images of iPSC colonies undergoing differentiation for 7 days with addition of 35 µM LY294002 on day 0; blue is Hoechst staining, green is TRA-1-81, red is SSEA-1, yellow is NCAM-1. Bottom row: corresponding MALDI ion images for m/z 748.5, with blue representing low peak abundance and red representing high abundance. B. Temporal changes in mean phospholipid abundance during spontaneous differentiation based on the LY294002 dose. Error bars show the 25th and 75th percentiles.

Figure S5. Ncam1 and Oct4 spatial expression in iPSC colonies are altered with LY294002. Edge-independent patterns of Ncam1/Oct4 expression from immunofluorescence imaging are observed as early as days 2 and 3 of PI 3-kinase inhibited differentiation with 35 µM or 100 µM LY294002, in contrast to the control differentiation with vehicle. Scale bars are 1 mm.

Figure S6. Immunocytochemistry images of the 8-day spontaneous differentiation for control and PI 3-kinase inhibited conditions. Cells were fixed and stained for Oct4 (green), Otx2 (red), and Pax6 (pink) in an independent set of experiments. A. Temporal changes in the expression of pluripotency markers in control versus PI 3-kinase inhibited samples. B. Difference in spatial patterns of pluripotency marker expression in control versus PI 3-kinase inhibited samples.

Figure S7. Dividing cells manual annotation and validation results. Circled nuclei were manually annotated as dividing (109 cells total). The segmentation algorithm detected a total of 2846 cells, and neighbor-relative Hoechst intensity, area, and nearest-neighbor distance were calculated for each cell. Using those metrics, K-means clustering predicted the dividing cells with an accuracy of 98.8%, a sensitivity of 72.5%, and a positive predictive value of 96.3%.

Figure S9. Representative blank spectra collected on A) Bruker Rapiflex MALDI TOF and B) Bruker SolariX MALDI FTICR. Blank spectra were collected by sampling a small area on the experimental slide that does not contain cells but does contain norharmane matrix.