Identification of Water-Soluble Polymers through Machine Learning of Fluorescence Signals from Multiple Peptide Sensors

Concern has been growing about the discharge of water-soluble polymers (especially synthetic polymers) into the environment, so identifying water-soluble polymers in water samples is becoming increasingly important. In this study, a chemical tongue system that simply and precisely identifies water-soluble polymers using multiple fluorescently responsive peptide sensors was demonstrated. The fluorescence spectra obtained from mixtures of each peptide sensor with a water-soluble polymer varied with the combination of polymer species and peptide sensor. Water-soluble polymers were successfully identified through supervised or unsupervised machine learning of the multidimensional fluorescence signals from the peptide sensors.


Peptide Synthesis. Peptides with a free N-terminus and an amidated C-terminus were synthesized by standard Fmoc-based solid-phase synthesis following a standard protocol.¹ In brief, the peptide chains were assembled on NovaSyn TGR resin (amino group loading of 0.25 mmol g⁻¹) using Fmoc amino acid derivatives. To cleave the peptides from the resin and remove the side-chain protecting groups, the resins were treated with trifluoroacetic acid (TFA)/1,2-ethanedithiol/thioanisole/m-cresol (10/0.75/0.75/0.25, v/v/v/v) for 3 h. The peptides were purified by reverse-phase high-performance liquid chromatography (HPLC; ELITE LaChrom, Hitachi High-Tech Corporation, Tokyo, Japan) using a COSMOSIL 5C18-AR-300 packed column (20 × 250 mm, Nacalai Tesque, Inc., Kyoto, Japan) with a linear gradient from 99.9% H2O/0.1% TFA to 99.9% acetonitrile/0.1% TFA at a flow rate of 6 mL min⁻¹. For ANM conjugation, ANM (5.7 mg, 18 μmol) was mixed with the corresponding peptide bearing an additional C-terminal Cys (14 μmol) dissolved in dimethyl sulfoxide, and the mixture was stirred for 2 h at ambient temperature. The ANM-conjugated peptides were purified by HPLC.
PNIPAM Synthesis. The polymers used were those from previous studies.
Circular Dichroism (CD) Measurements. The CD spectra of the peptide sensors dissolved in sodium phosphate buffer (10 mM phosphate, pH 7.4) at a concentration of 100 µM were recorded on a CD spectrometer (J-725, JASCO) using a UV cell with an optical path length of 0.2 cm under an N2 atmosphere at 25 °C. Spectra were collected over a wavelength range of 190-250 nm with a resolution of 0.5 nm and a scanning speed of 50 nm min⁻¹. Four scans were accumulated for each spectrum.
Polymer Classification and Identification. LDA and HCA were performed using the SYSTAT 13 program (Systat Software Inc., San Jose, USA). For LDA, the fluorescence signals for each polymer were transformed into canonical scores with the polymer species as the classifying variable. The canonical scores were plotted with 95% confidence ellipses. HCA dendrograms were created based on Euclidean distances using the Ward method, and the dataset was standardized before analysis using the following equation: z = (x − μ)/σ, where z is the standardized score, x is the raw score, μ is the population mean, and σ is the population standard deviation. For LOOCV, one dataset for a certain polymer was held out as the test dataset. The test data were assigned to the ellipse, generated from the remaining training datasets, with the shortest Mahalanobis distance. This process was repeated for all datasets (namely, one hundred datasets).
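The standardization and the leave-one-out classification described above can be sketched as follows. This is a minimal illustration, not the SYSTAT procedure used in this work: it assumes that two-dimensional canonical scores are already available for every dataset, it does not refit the LDA after each sample is held out, and the function names and the synthetic demo scores are hypothetical.

```python
import math
import random
from statistics import fmean, pstdev


def standardize(column):
    """z = (x - mu) / sigma, using the population mean and standard deviation."""
    mu, sigma = fmean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def mean_and_cov(points):
    """Centroid and 2x2 sample covariance (sxx, sxy, syy) of (x, y) points."""
    n = len(points)
    mx = fmean(p[0] for p in points)
    my = fmean(p[1] for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point, centroid, cov):
    """Mahalanobis distance in 2D, using the closed-form 2x2 matrix inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - centroid[0], point[1] - centroid[1]
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def classify(point, training):
    """Return the label whose training cluster is nearest in Mahalanobis distance."""
    return min(
        training,
        key=lambda label: mahalanobis(point, *mean_and_cov(training[label])),
    )


def loocv_accuracy(scores):
    """Hold out each point once and classify it against all remaining points."""
    correct = total = 0
    for label, points in scores.items():
        for i, held_out in enumerate(points):
            training = dict(scores)
            training[label] = points[:i] + points[i + 1:]
            if classify(held_out, training) == label:
                correct += 1
            total += 1
    return correct / total


def make_demo_scores(seed):
    """Synthetic, well-separated clusters (hypothetical data, ten points each)."""
    rng = random.Random(seed)
    centers = {"polymer_A": (0, 0), "polymer_B": (30, 0), "polymer_C": (0, 30)}
    return {
        label: [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)) for _ in range(10)]
        for label, (cx, cy) in centers.items()
    }
```

Calling `loocv_accuracy(make_demo_scores(0))` returns the fraction of held-out points assigned to their own cluster. In a faithful reproduction, the canonical scores would be recomputed from the training datasets in every iteration before the distances are evaluated.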
For the hold-out method, three of the ten datasets for each polymer were randomly selected as the test datasets. The LDA score plot and the 95% confidence ellipses were produced using the remaining seven training datasets. The test data were assigned to the ellipse with the shortest Mahalanobis distance. For SKCV, one-third (3-fold) or one-fifth (5-fold) of the datasets for each polymer were used as the test datasets, and all test data were classified in the same way as for LOOCV.
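The two splitting schemes can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure run in SYSTAT: the function names and polymer labels are hypothetical, replicates are referred to by index (0 to 9), and because ten replicates do not divide evenly by three, the 3-fold case here yields test sets of four, three, and three replicates per polymer.

```python
import random


def holdout_split(labels, n_per_class, n_test, rng):
    """For each label, randomly pick n_test replicate indices as the test set."""
    split = {}
    for label in labels:
        test = sorted(rng.sample(range(n_per_class), n_test))
        train = [i for i in range(n_per_class) if i not in test]
        split[label] = (train, test)
    return split


def stratified_folds(labels, n_per_class, k, rng):
    """Return k folds; every fold tests about 1/k of each label's replicates."""
    shuffled = {}
    for label in labels:
        idx = list(range(n_per_class))
        rng.shuffle(idx)  # independent random order for each label
        shuffled[label] = idx
    folds = []
    for fold in range(k):
        folds.append({
            label: (
                sorted(i for pos, i in enumerate(idx) if pos % k != fold),  # train
                sorted(i for pos, i in enumerate(idx) if pos % k == fold),  # test
            )
            for label, idx in shuffled.items()
        })
    return folds


# Hypothetical labels standing in for the ten polymers.
polymers = [f"polymer_{n}" for n in range(10)]
rng = random.Random(0)
holdout = holdout_split(polymers, n_per_class=10, n_test=3, rng=rng)
five_fold = stratified_folds(polymers, n_per_class=10, k=5, rng=rng)
```

Each entry maps a polymer label to a `(train, test)` pair of replicate indices, so the hold-out split yields thirty test datasets in total, and each replicate appears in exactly one test fold of the k-fold scheme.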

The synthesized PNIPAM was characterized by ¹H nuclear magnetic resonance spectroscopy (AVANCE III HD500, Bruker Corporation, Yokohama, Japan) in chloroform-d6. The molecular weight of PNIPAM was determined by SEC (HLC-8120 Gel Permeation Chromatography System, Tosoh Corporation, Tokyo, Japan) equipped with TSKgel GMHXL and G2000HXL columns (Tosoh Corporation, Tokyo, Japan) with ultraviolet and refractive index detection, using N,N-dimethylformamide containing 10 mM LiBr as the eluent at a flow rate of 1.0 mL min⁻¹ at 40 °C.
Fluorescence Measurement. Polymers were dissolved in 35 μL of BR buffer (pH 7.0), a pH-adjusted buffer solution prepared from phosphoric acid (40 mM), acetic acid (40 mM), and boric acid (40 mM). Thirty-five microliters of the peptide-ANM solution (1 μM in the same solvent) was mixed with each polymer solution, and the mixtures were incubated for 90-110 min at 25 °C. The fluorescence spectra of the mixtures in a quartz cell (3 × 3 × 35 mm) were recorded at an excitation wavelength of 350 nm using a fluorescence spectrophotometer (FP-6500, JASCO Corporation, Tokyo, Japan) at 25 °C.

Figure S1. Fluorescence spectra of the peptide sensors in the presence or absence of water-soluble polymers. The concentrations of each peptide sensor and each polymer were 1 µM and 10 mg L⁻¹, respectively.

Figure S5. HCA for the fluorescence signals.

Figure S7. Canonical scores for thirty blind samples on the LDA score plot, which was prepared using seven datasets randomly selected from the ten datasets for each polymer. See Figure S6 for the plot used for the 95% confidence ellipses.